Step 2: Creating an audio transcription app

Create a new Python file named speech2text_app.py.

Exercise: Complete the transcript_audio function.

Using what you learned in Step 1, fill in the missing parts of the transcript_audio function.

    import torch
    from transformers import pipeline
    import gradio as gr

    # Function to transcribe audio using the OpenAI Whisper model
    def transcript_audio(audio_file):
        # Initialize the speech recognition pipeline
        pipe = #-----> Fill here <----
        # Transcribe the audio file and return the result
        result = #-----> Fill here <----
        return result

    # Set up Gradio interface
    audio_input = gr.Audio(sources="upload", type="filepath")  # Audio input
    output_text = gr.Textbox()  # Text output

    # Create the Gradio interface with the function, inputs, and outputs
    iface = gr.Interface(fn=transcript_audio,
                         inputs=audio_input, outputs=output_text,
                         title="Audio Transcription App",
                         description="Upload the audio file")

    # Launch the Gradio app
    iface.launch(server_name="0.0.0.0", server_port=7860)

Solution:

    import torch
    from transformers import pipeline
    import gradio as gr

    # Function to transcribe audio using the OpenAI Whisper model
    def transcript_audio(audio_file):
        # Initialize the speech recognition pipeline
        pipe = pipeline(
            "automatic-speech-recognition",
            model="openai/whisper-tiny.en",
            chunk_length_s=30,
        )
        # Transcribe the audio file and return the result
        result = pipe(audio_file, batch_size=8)["text"]
        return result

    # Set up Gradio interface
    audio_input = gr.Audio(sources="upload", type="filepath")  # Audio input
    output_text = gr.Textbox()  # Text output

    # Create the Gradio interface with the function, inputs, and outputs
    iface = gr.Interface(fn=transcript_audio,
                         inputs=audio_input, outputs=output_text,
                         title="Audio Transcription App",
                         description="Upload the audio file")

    # Launch the Gradio app
    iface.launch(server_name="0.0.0.0", server_port=7860)
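
The solution builds the pipeline inside transcript_audio, so the model is loaded again on every request. One possible variation, offered as a sketch rather than the lab's intended answer, builds the pipeline once and reuses it. The helper name get_pipe and the message returned for an empty upload are hypothetical; the None check assumes Gradio passes None when the user submits without selecting a file.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_pipe():
    """Build the Whisper pipeline on first use and reuse it afterwards."""
    # Imported lazily so the model is only loaded when the first
    # transcription request arrives.
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-tiny.en",
        chunk_length_s=30,
    )


def transcript_audio(audio_file):
    """Transcribe the audio file at the given path and return the text."""
    if audio_file is None:
        # Assumption: Gradio passes None when no file was uploaded
        return "Please upload an audio file first."
    # Same call as in the solution, but on the cached pipeline
    return get_pipe()(audio_file, batch_size=8)["text"]
```

This version of transcript_audio can be passed to gr.Interface exactly as in the solution; only the first request pays the cost of loading the model.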

Then, run your app:

    python3 speech2text_app.py
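
If the app page does not load, a quick check is whether anything is accepting connections on the port the script passes to iface.launch (7860). The following is a minimal sketch using only the standard library; the function name is_port_open is hypothetical, and the check only confirms that a TCP connection can be opened, not that Gradio itself is healthy.

```python
import socket


def is_port_open(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError if nothing is listening
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling is_port_open() from a second terminal while the script is running should return True; if it returns False, check the first terminal for error messages from the script.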

Then open the app in your browser. The script serves it on port 7860.

You can download the sample audio file we've provided by right-clicking on it in the file explorer and selecting "Download." Once downloaded, you can upload this file to the app. Alternatively, feel free to choose and upload any MP3 audio file from your local computer.

The result will be:

langchain

Press Ctrl + C to stop the application.